A Program of Longitudinal Research on the Higher Educational System 
Robert J. Panos 
John A. Creager 

American Council on Education 

The Office of Research of the American Council on Education has 
recently undertaken a large-scale program of longitudinal research on 
the higher educational system. The major objectives of this program are 
to assess the impact of different college environments on the student's 
development and to provide a source of current, readily available descrip- 
tive information about the population of college students. A pilot study 
involving 42,061 entering freshmen at 61 institutions was conducted in the 
fall of 1965; the full-scale study of entering students in a representative 
sample of 300 institutions was begun in 1966. The purpose of this report 
is to present a detailed analysis of the rationale, design, current status, 
and possible applications of this program of research. 

The past few years have seen a significant increase in the number 
of interinstitutional studies in higher education primarily because of 
the ease with which quantities of data can be collected and summarized, 
and because institutional administrators have been extremely cooperative. 
Most of these studies, however, have used biased or accidental samples 
of students and institutions. Many have merely been adjuncts to ongoing 
testing or scholarship programs. In both cases, these projects have 
tended to focus on specialized concerns, without viewing their possible 
contribution to the higher education system. Studies of fiscal and ad- 
ministrative practices, for example, have generally failed to deal 
directly with the impact of these practices on student development. 
Similarly, most studies of underachievers and dropouts have been con- 
cerned exclusively with student characteristics, and have not attempted 
to incorporate environmental data into the analyses. 

Many of these project-oriented research studies are extremely 
expensive inasmuch as their data files are of limited usefulness in 
further research. Because of differences in measurement instruments, 
sampling techniques, and methods of subject identification, the data 
from the different investigations are seldom interchangeable, and the 
researcher initiating a new project typically starts his data collection 
from scratch. In addition to the duplicative costs of new starts and 
the excessive use of students' time, this practice means that each new 
longitudinal study requires an unnecessarily long time to complete. 

The initial goal of the American Council on Education's research 
program is to create and maintain a comprehensive file of longitudinal 
data from a representative sample of higher education institutions. The 
research data file has been designed to include the following features: 
a representative sample of institutions; comprehensive data concerning 
students, faculty, environments, and administrative policies; and 
longitudinal data that can be merged with data collected by other 
investigators. This file will provide the frame of reference for a continuing 
series of longitudinal studies of the higher education system and, it 
is hoped, will also serve as a basis for cooperation and coordination 
of activities among research organizations and individuals concerned 
with the study of higher education. 

Data Files 

Most of the large research data files currently in operation or 
under development are of two types. In the first, most common type, the 
information is accumulated for some immediate and practical purpose rather 
than for educational research. For example, the results from the millions 
of achievement and ability tests administered each year to high school 
juniors and seniors are used for counseling and for selection or screen- 
ing purposes. Although such programs have been in operation for many 
years, their research potential has only recently been considered (Astin, 
1965a). The principal limitations of such files are that the samples of 
subjects and institutions tend to be biased, and that the operational 
goals of the program restrict the type and amount of research-oriented 
data that can be collected. 

The second type of large data file now in use serves primarily 
as a repository or library. Here the investigator's main concern is to 
establish a clearinghouse or a central repository for all the available 
data pertaining to a given topic. However, since the investigator in 
this case is largely dependent on data collected by other researchers, 
his files are usually marred by unrepresentative samples that overlap 
only partially and by large gaps in information. Clearly, the most useful 
data file for interinstitutional studies in higher education is one that 
is designed from the beginning as a tool for research purposes. 

Previous experience with large files of data has suggested certain 
specifications for a research data file. A minimum requirement is that 
the information be stored as an ordered set of records. These records 
should in turn be organized in such a way that they can be retrieved 
with a minimum of effort for subsequent analyses. In addition, the units 
of sampling (whether they are individuals or institutions) must be 
adequately identified. Finally, the anonymity of individuals and 
institutions must be protected. 

A set of records may be called a data "bank," "base," or "registry," 
depending to some extent on which organization is proposing or developing 
it. The idea of a data bank, recently popular in educational research, 
is not new, for it shares many features with other information systems 
developed in military and medical research and in business data processing. 

Data Bases 

In the case of data "bases" (a term drawn from military contexts), 
a primary concern is with updating information on a system. One such ex- 
ample is the master personnel file of the U.S. Air Force, in which are 
maintained detailed data on all enlisted, officer, and reserve personnel. 
It is used for identification and selection, for reassignment of personnel, 
for manpower studies, and for other management purposes. Similarly, the 
SAGE early warning radar system monitors information on all incoming 
flights. A characteristic of these data bases is that the information 
becomes outmoded as time passes and must be deleted from the system. 
This characteristic reflects a "static" conception in that the immediate 
concern is with a description of the way things are at a given moment. 
The data in such a file may be conceived as "descriptive" inasmuch as it 
provides information about a referent (individual, institution, etc.) in 
an absolute sense. For example, a person has a particular score on a 
given test or has passed or failed a particular item, or an institution 
falls into an arbitrary category such as "nonsectarian." 

Many business data files are basically descriptive in character. 
For example, the American Airlines SABRE system utilizes a central 
computer to store information about all seats on all planes in the system. 
When a purchase or seat reservation is made, the event is recorded and 
an adjustment made in the available inventory. In addition, information 
is placed in the file giving the name, address, and telephone number of 
the individual making the transaction. Similarly, cancellations and 
stand-by requests are deleted and inserted into the file on a "real-time" basis. 

Data Registries 

Data "registries," in contrast to data bases, are characterized 
by the need to maintain an historical record over a period of time: that 
is, a registry contains longitudinal records for all individuals within a 
data file. In this case, the primary concern is with updating information 
on an individual. For example, consider the characteristics of psychiatric 
registries. The intended or "risk" population may be defined as those in- 
dividuals who contact a member of the psychiatric profession; a longitu- 
dinal record (including demographic and socioeconomic data) is maintained 
in the registry for every individual who has contact with a psychiatric 
facility. 

The individuals (in this case, patients) who enter the system are 
defined according to their contact through admission to or discharge or 
transfer from some kind of psychiatric facility. The registry is used 
for mental health research, planning, and evaluation. The major concern 
is with maintaining information over an extended period of time rather 
than deleting it as soon as possible. Such a system reflects a "dynamic" 
orientation, where the concern is with the evaluation of change or improve- 
ment rather than with a description of the present condition. 
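
The contrast between a data "base" and a data "registry" can be illustrated 
with a minimal sketch in Python. The class names DataBase and DataRegistry, 
the identifier "S001", and the sample fields are hypothetical and are used 
only for illustration; they do not describe any of the systems named above. 

```python
from collections import defaultdict


class DataBase:
    """Keeps only the current description of each referent ("static")."""

    def __init__(self):
        self._current = {}

    def update(self, referent_id, record):
        # The new record replaces the old one; outmoded data are discarded.
        self._current[referent_id] = record

    def describe(self, referent_id):
        return self._current[referent_id]


class DataRegistry:
    """Keeps every dated record for each referent ("dynamic")."""

    def __init__(self):
        self._history = defaultdict(list)

    def update(self, referent_id, year, record):
        # Records accumulate, so the individual file grows with each contact.
        self._history[referent_id].append((year, record))

    def history(self, referent_id):
        return sorted(self._history[referent_id], key=lambda entry: entry[0])


base = DataBase()
base.update("S001", {"status": "enrolled"})
base.update("S001", {"status": "graduated"})

registry = DataRegistry()
registry.update("S001", 1966, {"status": "enrolled"})
registry.update("S001", 1970, {"status": "graduated"})

print(base.describe("S001"))          # only the latest record survives
print(len(registry.history("S001")))  # both dated records are retained
```

In the sketch the base answers only the question "what is the case now," 
whereas the registry can also answer "what has changed," at the cost of a 
record that grows with every update. 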

Base or Registry? 

The data file developed at the American Institute for Research 
from the Project Talent study (1964) represents still another kind of 
conception and a very different orientation. In such files information 
has been accumulated in the course of a large educational research 
project and is stored for future use. In the sense that the information 
reflects a representative description of the population being studied at 
the time of the data collection, it can be characterized as a data "base." 
On the other hand, the individuals in the file who are followed up are 
like entries in a data registry. Furthermore, although the data base 
becomes more and more outdated with time (the data for Project Talent were 
collected in 1960), the original records are not deleted from the system 
so that they may remain available for further follow-up studies. In the 
context of an extended longitudinal study we are thus faced with the 
prospect of a data file in which the individual records increase indefi- 
nitely in size. 

The time and expense involved in maintaining so voluminous a data 
file become all the more unjustifiable when one considers the high prob- 
ability that much of the original data base will be of little or no value 
in the future. Thus, some of the information contained in the individual 
record could be deleted from the system. Yet the present state of the 
art, at least in educational research, does not permit us to determine 
which items will prove most useful in the future. Therefore, as our 
knowledge increases, we must be alert to determine which information has 
become outmoded and can be deleted from the file. 

The ACE Longitudinal File 

The American Council on Education's research data file has been 
designed to incorporate the best features of both a data base and a data 
registry. The data files will include longitudinal records on the in- 
stitutions and individuals selected, as well as current descriptive in- 
formation on the population. The master data file for this program of 
research will incorporate all pertinent data concerning higher educational 
institutions. Whenever relevant, these data will be collected on a con- 
tinuing basis to keep the base characteristic of the file as current as 
possible. Figure 1 displays the independently accessible data files 
available from the research program. 

In the following sections we shall discuss the sampling design, 
the types of data to be included, the organization and structure of the 
data file, and the conceptual framework for our program of research. 

Sampling Design 

The primary sampling unit in the research program is the institution. 
In order to include all institutions of higher education (universities, 
colleges, junior colleges, and even nonaccredited institutions), the 
defined population consists of all eligible institutions listed by 
the U.S. Office of Education in its 1965-66 Education Directory, Part 3, 
Higher Education. "Eligible" here means that the institution is 
functioning at the time of the survey and has the equivalent of a "freshman" 
class with at least 30 members. The latter requirement eliminates 
institutions that require one or more years of undergraduate college-level 
work for admission to their "first" class. It also eliminates some very 
small institutions, the growth of which may bring them into the defined 
population in subsequent years of the program. Under these restrictions, 
the eligible population consists of 1,968 of the 2,281 institutions listed 
in the 1965-66 Education Directory. 
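
The eligibility rule just described can be written as a simple filter. The 
following Python sketch is illustrative only: the Institution record, its 
field names, and the four sample entries are hypothetical, and only the 
two criteria (functioning at survey time, freshman-equivalent class of at 
least 30) come from the text. 

```python
from dataclasses import dataclass


@dataclass
class Institution:
    name: str
    functioning: bool          # operating at the time of the survey
    freshman_class_size: int   # 0 if there is no freshman-equivalent class


MIN_FRESHMAN_CLASS = 30  # threshold stated in the text


def is_eligible(inst: Institution) -> bool:
    """Apply the two eligibility criteria to one directory entry."""
    return inst.functioning and inst.freshman_class_size >= MIN_FRESHMAN_CLASS


directory = [
    Institution("College A", True, 250),
    Institution("Institute B", True, 12),   # class too small
    Institution("School C", True, 0),       # admits only advanced students
    Institution("College D", False, 400),   # not functioning
]

eligible = [inst.name for inst in directory if is_eligible(inst)]
print(eligible)
```

Applied to the full directory, a filter of this kind retains 1,968 of the 
2,281 listed institutions, or about 86 percent. 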

Stratification of Institutions in the Population 

The primary goal in the sampling design was to minimize random 
errors in order to ensure that the sample was representative of the defined 
population. Considerations such as costs and logistics, however, led to 
the adoption of a complex or mixed-strategy design for the research program. 
As a compromise between the need to reduce costs and the requirement of 
representativeness, a sample size of about 300 institutions (approximately 
15% of the eligible population) was used. 

The major control of sampling error is achieved by stratification 
of the population of institutions along dimensions that are known to rep- 
resent important functional characteristics of the institutions. Random 
selection of institutions within different levels of these dimensions 
thus increases the representativeness of the sample. Although the choice 
of dimensions for stratification of the population of institutions is 
ideally determined by the relevance of the various dimensions to control 
of error, the alternatives are necessarily limited by what information 
is available. 

Astin (1962a) has shown that institutional size (total full-time 
enrollment) and affluence (per-student expenditures for educational and 
general purposes) account for the major portion of the variation among 
four-year institutions in selectivity, financial characteristics, level 
of faculty training, and curriculum. These two institutional variables 
are also highly related to the college environment (Astin, 1963a, 1965b; 
Astin and Holland, 1961) and to the characteristics of the entering 
students (Astin, 1965c). Affluence is more strongly related to these 
other factors than size is. Although a measure of size is available on 
all four-year institutions, information from which affluence can be com- 
puted (educational and general expenditures) is available only for most 
of the regionally accredited four-year institutions (Cartter, 1964). 

These early studies provided the rationale for stratification of 
all four-year institutions. First, the 1,375 eligible four-year institutions 
were separated into colleges (n = 1,202) and universities (n = 173). (This 
dichotomy not only is administratively meaningful, but also exerts consider- 
able control over the size dimension.) Next, both groups were separated 
into 10 levels of affluence ("less than $750" per student, proceeding in 
$250 steps to "$2,500 or more" per student, plus an "unknown" category). 
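
The ten affluence levels can be made concrete with a short Python sketch. 
The function name affluence_level and the label strings are hypothetical, 
and the treatment of boundary values (for example, placing exactly $1,000 
in the "$1,000-$1,249" level) is an assumption, since the text gives only 
the end categories and the step size. 

```python
from typing import Optional

LEVEL_WIDTH = 250      # dollars per step, as stated in the text
LOWEST_CUTOFF = 750    # "less than $750"
HIGHEST_CUTOFF = 2500  # "$2,500 or more"


def affluence_level(expenditure_per_student: Optional[float]) -> str:
    """Return the affluence stratum label for one institution."""
    if expenditure_per_student is None:
        return "unknown"
    if expenditure_per_student < LOWEST_CUTOFF:
        return "less than $750"
    if expenditure_per_student >= HIGHEST_CUTOFF:
        return "$2,500 or more"
    # Intermediate levels: $750-$999, $1,000-$1,249, ..., $2,250-$2,499.
    # Assumption: each level includes its lower bound.
    steps = int((expenditure_per_student - LOWEST_CUTOFF) // LEVEL_WIDTH)
    lower = LOWEST_CUTOFF + steps * LEVEL_WIDTH
    return f"${lower:,}-${lower + LEVEL_WIDTH - 1:,}"


# One probe value per level, plus the "unknown" case.
all_levels = {affluence_level(x) for x in [None, 0] + list(range(750, 2751, 250))}
print(affluence_level(1100))
print(len(all_levels))
```

Counting the two open-ended categories, the seven intermediate $250 bands, 
and the "unknown" category gives the ten levels mentioned above. 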

The sampling design involves different stratification procedures 
for the two-year and four-year institutions, respectively. These two 
groups thus define the first major stratum in the eligible population, not 
only because they represent an important functional dichotomy, but also 
because recent research indicates that different bases for further strati- 
fication of the two groups should be employed. Richards, Rand, and Rand 
(1965), for example, have recently identified six major characteristics 
of junior colleges: cultural affluence, technological specialization, 
size, age, transfer emphasis, and business orientation. Among other 
things, their results suggest that enrollment (size) and type of support 
(dichotomized as public-private) account for a major share of the known 
differences among the two-year institutions. On the basis of this finding, 
it was decided to stratify the 592 eligible two-year institutions first by 
mode of control (public versus private), and then by size. 

Sampling Within Cells 

In a strictly representative stratified-random sample, a fixed 
proportion (e.g., 15%) of the institutions in each stratification cell 
would be picked to define the sample of institutions. This procedure was 
deliberately modified in several ways to protect against errors resulting 
from nonparticipation, to reduce the cost per individual student, to pro- 
tect against accumulating sampling errors in some of the more heterogeneous 
categories, and to reduce the risk of compounding errors in the aggregate 
student data. Thus, the universities were deliberately oversampled, since 
the peculiarities of just a few large institutions could introduce an 
appreciable bias into the student norms. Although using more large 
institutions increases some of the logistical problems, the risk of peculiarity 
effects is diversified over more institutions, with the data from any one 
institution receiving relatively less weight in the aggregate pooling 
operations. In addition to oversampling the universities, institutions 
were oversampled in the end categories of affluence and enrollment to re- 
duce sampling error arising from the open-ended nature of these categories. 

The institutions were initially sorted into the appropriate 
stratification cells, the cell members shuffled, and 371 institutions 
randomly chosen for the contact sample. (An expected rate of cooperation 
of 80% would yield about 300 participants.) The only departure from strict 
randomness was the deliberate inclusion in the 371 of 61 institutions 
that had been selected from a similar stratification design for the 1965 
pilot study (Astin and Panos, 1966). The cell counts were adjusted 
accordingly for the remaining sampling done at random within the 
stratification cells. An additional 25 institutions, not included as part of the 
sample, were also selected either by their own request or because they 
were known to have educational programs of some special interest to the 
research staff. 
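
The selection procedure (sort into cells, shuffle, draw, with certain 
institutions included deliberately) can be sketched in Python as follows. 
This is an illustration under stated assumptions, not the procedure actually 
programmed for the study: the function draw_contact_sample, the toy 
population of three cells, the quotas, and the "pilot" identifiers are all 
hypothetical. 

```python
import random
from collections import defaultdict


def draw_contact_sample(population, cell_quota, forced, seed=0):
    """Draw a stratified sample with some units included deliberately.

    population: dict mapping institution id -> stratification cell
    cell_quota: dict mapping cell -> number of institutions wanted
    forced:     ids included deliberately (e.g. pilot-study institutions)
    """
    rng = random.Random(seed)
    by_cell = defaultdict(list)
    for inst_id, cell in population.items():
        by_cell[cell].append(inst_id)

    sample = set(forced)
    for cell, members in by_cell.items():
        # Adjust the cell count for institutions already included.
        already = [m for m in members if m in sample]
        remaining_quota = max(cell_quota.get(cell, 0) - len(already), 0)
        candidates = [m for m in members if m not in sample]
        rng.shuffle(candidates)  # "the cell members shuffled"
        sample.update(candidates[:remaining_quota])
    return sample


# Toy population: 3 cells of 20 institutions each (hypothetical data).
population = {f"{cell}{i:02d}": cell for cell in "ABC" for i in range(20)}
quota = {"A": 3, "B": 5, "C": 4}
pilot = {"A00", "B07"}

contact = draw_contact_sample(population, quota, pilot)
per_cell = {c: sum(1 for i in contact if population[i] == c) for c in quota}
print(per_cell)
print(round(371 * 0.80))  # expected participants at 80% cooperation
```

With a contact sample of 371 and an assumed cooperation rate of 80%, the 
expected number of participants is 371 x 0.80 = 296.8, which is the "about 
300" cited above. 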

In the spring of 1966 an invitation to participate in the study 
was sent by ACE President Logan Wilson to the presidents of each of the 
371 institutions. Positive replies were eventually received from 295 in- 
stitutions. Since only 16 of the original 371 institutions actually replied 
that they were unable to participate, the bulk of the nonparticipants 
consists of institutions that failed to respond at all either to the orig- 
inal invitation or to the two follow-up letters.* 

Although the actual rate of participation was almost identical to 
the expected rate of 80%, there was a large discrepancy between the rates 
for four-year and two-year institutions (85% and 60%, respectively). In 
particular, it appeared that the smallest of the two-year institutions 
were the least likely to participate. Since the sample of two-year 
institutions is thus somewhat smaller than anticipated, several additional 
two-year institutions will be invited to participate in the 1967 survey 
of entering freshmen. 

* We are indebted to Dr. Edmund J. Gleazer, Jr., executive director of 
the American Association of Junior Colleges, who kindly assisted us in 
enlisting the interest and cooperation of the two-year institutions. 

Figure 2 shows the eligible population stratified 
into the 28 sample cells together with the number of participants 
in each cell. Since the data from a few of the participants may not be 
usable because of inadequate sampling of their entering freshmen classes, 
we expect that the actual number of participating institutions to be in- 
cluded in the 1966 norms will be a few less than 295. 

The disproportionate sampling from the various stratification 
cells requires that data from participating institutions in each strati- 
fication cell be weighted to equate the cell proportions with those of the 
defined population. Data collected within institutions will be further 
adjusted to correct for incomplete participation of individuals within 
institutions. The final set of weights will be presented in subsequent 
reports of normative data. 
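
One simple way to equate cell proportions with those of the population is 
to weight each participating institution by the ratio of the population 
count to the participant count in its cell. The Python sketch below assumes 
that form of weight; the function name cell_weights and the two cells with 
their counts are hypothetical, and the weights actually used in the program 
are to be given in the later normative reports. 

```python
def cell_weights(population_counts, participant_counts):
    """Weight = institutions in the population cell / participants in that cell."""
    weights = {}
    for cell, n_population in population_counts.items():
        n_participants = participant_counts.get(cell, 0)
        if n_participants == 0:
            raise ValueError(f"no participants in cell {cell!r}")
        weights[cell] = n_population / n_participants
    return weights


# Hypothetical counts for two cells; these are not the Figure 2 values.
population_counts = {"university, high affluence": 20, "college, low affluence": 200}
participant_counts = {"university, high affluence": 10, "college, low affluence": 20}

weights = cell_weights(population_counts, participant_counts)

# Weighted participants reproduce the population total for the two cells.
weighted_total = sum(weights[c] * participant_counts[c] for c in weights)
print(weights)
print(weighted_total)
```

In this example the oversampled university cell receives a weight of 2 and 
the undersampled college cell a weight of 10. A second factor of the same 
kind (entering class size divided by number of respondents) is one possible 
way to adjust for incomplete participation within an institution, although 
the text does not specify the adjustment used. 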

In summary, the sampling of the four-year institutions appears 
to have been even more successful than expected. In the two-year insti- 
tutions, however, especially those under private control, a higher rate 
of nonparticipation was encountered than expected. In light of this 
experience, and in view of the fact that this segment of the population 
is rapidly changing, the sampling in subsequent years of the research 
program will provide for greater representation of two-year institutions, 
with appropriate changes in weights. 

Kinds of Data 

The most readily available source of information about higher 
educational institutions is the student. Students are, in some respects, 
a captive audience and have become accustomed to completing a variety of 
questionnaires, forms, tests, inventories, booklets, and the like. The 
considerable interest of researchers and administrators in student data 
(probably regarded by the students themselves as unnecessarily redundant) 
is easily understood. 

Presumably, an institution of higher education functions to help 
the student become an adult by providing appropriate and relevant exper- 
iences. Only by learning something about the student, and how he changes 
during college, can the people responsible for defining educational ob- 
jectives and for structuring particular learning experiences discover 
what their programs in fact accomplish. It is here, in studies focused 
on students, that the principal justification for the elaborate and ex- 
pensive system of higher education becomes evident. Information about the 
student and his development is, in short, the core of the research program. 

In addition to student data, there are at least four categories 
of information about institutions of higher education that have been con- 
sidered important, as is evidenced by the large amounts of literature 
reporting on or alluding to them: finances and financial policies; 
curriculum; administrative policies and practices; and faculty. It has 
already been demonstrated that collecting information in the first three 
categories, finances, curriculum, and administrative practices, is 
practical. For example, the American Council on Education's quadrennial 
publications American Universities and Colleges and American Junior 
Colleges contain detailed data on endowment, operating budget, income, 
and a variety of institutional characteristics and programs. However, 
these data are currently available only in printed form. The incorpora- 
tion of these data into the master file will make these and related 
financial, curricular, and administrative data readily available in a 
useful form and might prompt further evaluation of what items of 
information are most valid for specific purposes. 

Information about faculty is generally not available, although 
it is obvious that studies of faculty work loads, preparation, and migra- 
tion, for instance, would be of great value to a wide variety of persons, 
agencies, and organizations interested in higher education. The reason 
for the gap is not clear: perhaps faculty are reluctant to provide infor- 
mation about themselves; perhaps researchers have deliberately disregarded 
this area. Nevertheless, Cartter's recent studies of faculty quality (1966) 
and Brown's study of the college teacher market (1965) demonstrate that 
such information can be obtained. 

Conceptual Framework for the Research Program 

The history of science is the history of the application of 
inductive, inferential procedures to experiential data. The various methodo- 
logies for generating experiential data can be classified into three broad 
categories: experimental, quasi-experimental, and nonexperimental. 



Experimental procedures are characterized by the random assignment 
of the experimental units to the treatment conditions. Randomization 
is usually assumed to be both necessary and sufficient in order to avoid 
ambiguity in the interpretation of the relationship between the independent 
and dependent variables; that is, to eliminate the necessity of taking into 
account the effect of variables not part of the experiment, and to assure 
the validity of the application of statistical significance tests. However, 
a single isolated "true" experiment is often of limited usefulness, since 
replication of the experimental conditions on any substantial scale is 
rarely feasible in social settings. Without the possibility of such repli- 
cation, the value of experimental research is more theoretical than practical. 
Furthermore, it is seldom possible (or even desirable) to assign the 
experimental units of ultimate concern in educational research (students) at 
random to various educational experiences. 

Quasi-experimental procedures are characterized by the recognition 
that randomization is not possible, but that sufficient control 
either of the treatment conditions or of the selection biases can be in- 
troduced to rule out some of the alternative explanations of the results. 
Campbell (1957) has studied the problem of experimentation in social settings 
in great detail. In the Handbook of Research on Teaching, he and Stanley 
(1963) outline a number of research designs that permit minimal bias in- 
ferences from such situations. Quasi, or socially relevant, experiments 
represent, perhaps, the only inferential paradigms applicable to the study 
of the impact of existing institutional programs (i.e., college environ- 
ments) upon the student. It should be noted, however, that even in the 
ideal quasi-experimental setting it is not possible to ensure complete 
control of bias. 

Nonexperimental inference is characterized by the development of 
models interrelating variables of concern in a conceptually meaningful 
way, and by the testing of such models against the data. In this case, 
the concern is with formulating and fitting models to experiential data, 
no matter how observed, in which randomization or direct control of the 
treatments is deemed either not possible or irrelevant. 

In the Council's research program, the primary unit of sampling 
is the institution. Nevertheless, the primary unit of concern 
is the student. Obviously, it is not possible to assign students at random 
to institutions. Furthermore, our entire program of research is designed 
to explore and to evaluate alternative methodological and theoretical ap- 
proaches to the measurement of college environments and to the assessment 
of their differential impact on the student. Therefore, the quasi-experi- 
mental and nonexperimental modes of inference are deemed more appropriate 
to our research program than is the traditional experimental mode of in- 
ference. 

The Research Model 

For the purposes of our research model, information about higher 
educational institutions can be sorted into three conceptually distinct 
categories: outputs, inputs, and operations. 

Outputs are the operational manifestations of educational objec- 
tives. Although these objectives can be expressed at very high levels of 
abstraction (for example, "the ultimate welfare of humanity"), we shall 
be concerned initially with those relatively immediate objectives that 
can be assessed directly through research. More specifically, we are 
referring to the behaviors of the students and faculty that the higher 
educational institution is attempting to influence. In the case of the 
student, these would include his achievements, knowledge, skills, values, 
interests, personality, and behavior toward his fellow man. Faculty out- 
puts would include teaching competence, scholarly productivity, and job 
stability. (Although the rest of our discussion will, for simplicity, 
focus only on student outcomes, the model is equally applicable to studies 
of faculty.) Adequate measures of relevant educational outputs are, clearly, 
the sine qua non of meaningful educational research (Astin, 1964a). 

Studies of student development in higher education have concentrated 
on intellectual or cognitive outcomes (Fishman, 1962), even though 
the educational enterprise is concerned with the student's total personal 
development. Although the research program will utilize the standard 
measures of educational outcomes (grades, persistence in college, later 
vocational achievement), an important feature of the research will be the 
broadening and improvement of techniques for assessing student outcomes 
in the noncognitive or behavioral domain. New measures will be incorporated 
into the longitudinal data file as they are developed. 

Inputs are the talents, skills, aspirations, and other potentials 
for growth and learning that the student brings with him into the higher 
educational institution. These inputs are, in a sense, the raw materials 
with which the institution has to deal. In collecting input information, 
it is of vital importance to measure all variables that are likely to 
affect the student's subsequent performance on the various outputs under 
study. 

Operations are those aspects of the higher educational institution 
that are capable of affecting the development of the student. These in- 
clude administrative policies and practices, curriculum, physical plant and 
facilities, teaching practices, peer associations, and other characteristics 
of the college environment. Although some progress in the assessment of in- 
stitutional environments has been made in recent years, the measurement of 
the college environment is still in a relatively primitive state both con- 
ceptually and methodologically. Consequently, one of the major goals of 
this research program is to develop and to test improved measurement tech- 
niques relevant to the problem of the college environment and its effect 
upon the student in a manner that will permit the statement of, and the 
testing of, rival hypotheses. 

In contrast to previous research on college environments, we view 
the college environment simply as a set of potential stimuli. The term 
"stimuli" refers here to those events or observable characteristics of the 
college that are capable of changing the sensory input to the student attend- 
ing the college. The basic task, then, is to identify observable events or 
characteristics of the institution that could serve as possible stimuli to 
the student. Our "stimulus" rationale can perhaps be made clearer by a 
comparison with previous approaches to assessing the college environment. 

The work of Pace and Stern (1958) and the later work of Pace (1964) 
and Thistlethwaite (1960) exemplify the impressionistic or "image" approach 
to assessment of the college environment. In this approach, the student is 
asked to rate his environment by means of a set of items similar to those 
typically found in personality inventories. Although a few of the items 
in these inventories are relatively objective and unambiguous, the majority 
of items ask for subjective judgments and impressions from the student ob- 
servers concerning the total "climate" of the institution. In the work done 
by Astin and Holland (1961) with the Environmental Assessment Technique (EAT), 
the college environment was assessed through knowledge of the personal char- 
acteristics of the students at the institution. It can be seen that neither 
of these sources of information (impressions of the environment, and personal 
characteristics of the students) adequately meets the criterion of a poten- 
tial stimulus. Thus, while the student's subjective impression of his 
college environment may give rise to certain behaviors that can in turn 
serve as stimuli for other students, the subjective impression per se does 
not constitute a stimulus. Similarly, neither the student's degree of
intelligence nor his personality characteristics constitute a stimulus by
our definition, although these traits may be manifest in certain typical 
behaviors that can then affect his fellow students. 

A model for interinstitutional research on student development
based on these three types of information is shown below:

                     Operations
                    ^          \
                 A /            \ B
                  /              v
    Student Inputs ------C------> Student Outputs
    (talents, aspirations)        (goals of the educational
                                   enterprise)

The principal objective in our research program is to determine
how the various educational environments represented by higher educational
institutions (operations) affect the performance of the student (outputs).
We are, therefore, primarily concerned with relationship B in the
diagram shown above. From a methodological point of view, however, a thorough
knowledge of relationships A and C is required before we can adequately
interpret relationship B.

With respect to relationship C, the experience of many years of
research on predicting human performance shows that the student's output
performance will be determined, in part, by his input characteristics.
More simply: the student's talents and aspirations when he enters college
will play a major role in determining what he is able to learn and the
kind of person he eventually becomes.

But it is the presence of relationship A that complicates the
design. It has been established in several major empirical studies that
certain characteristics of the college environment are closely related to
student input characteristics (Astin, 1963b, 1965c, d; Astin and Holland,
1961). The student input, therefore, is likely to be related both to
output and to the educational operations. Given this dual relationship,
it is possible for a significant relationship B to be mediated simply
by differential student input to the various environments.

This discussion makes it clear that any obtained relationship 
between educational practice and student output is necessarily ambiguous 
so long as no control is exercised over differential student input. The 
basic research strategy for dealing with this problem is modeled after 
several recent studies of differential college influence (Astin, 1962b,
1963b, 1963c, 1964a, 1965d; Nichols, 1964). First, an "expected" output
based on the student's input characteristics is computed. The effect of
this expected output is then removed from his observed output, producing
a "residual" output which is now independent of input:

    Output - Expected Output = Residual Output
             (based on input)  (now independent of input)

The final steps in the analysis are to relate the residual output to the
various environmental characteristics, and to search for
person-environment interaction effects.
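
The residual strategy described above can be illustrated with a minimal Python sketch. It assumes a single input measure and an ordinary least-squares prediction of the expected output; the studies cited may well have used several input variables and other procedures, and all data values and function names here are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_outputs(inputs, outputs):
    """Observed output minus the output expected from input alone."""
    slope, intercept = fit_line(inputs, outputs)
    return [y - (intercept + slope * x) for x, y in zip(inputs, outputs)]


def covariance(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


# Hypothetical data: one input score, one output score, and one
# environmental indicator (1 = practice present) per student
inputs = [50.0, 55.0, 60.0, 65.0, 70.0, 75.0]
outputs = [52.0, 54.0, 63.0, 64.0, 73.0, 72.0]
environment = [1, 0, 1, 0, 1, 0]

residuals = residual_outputs(inputs, outputs)

# By construction the residuals are uncorrelated with the input measure
input_residual_cov = covariance(inputs, residuals)

# Final step: relate the residual output to the environmental measure
with_practice = mean(r for r, e in zip(residuals, environment) if e == 1)
without_practice = mean(r for r, e in zip(residuals, environment) if e == 0)
print(with_practice - without_practice)
```

A difference in mean residual between the two environmental groups is the kind of evidence the design looks for, since the linear effect of the input measure has already been removed from the output.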

Because of the importance of the design used in these quasi- 
experiments, a continuing function of the program will be the improvement 
of techniques for controlling differential student inputs and for identi- 
fying significant interactions between student attributes and environmental 
characteristics. Our eventual goal is to identify those environmental 
variables that are most important in affecting the development of both 
students and faculty. 



The Freshman Information Form 

The Freshman Information Form is designed to serve two functions: 
first, to obtain standard data for immediate informational purposes; and 
second, to obtain student input data for research purposes. Thus it con- 
tains both basic biographical and demographic items that can be collected 
annually from each entering class, and a number of more research-oriented
items which can be modified regularly in order to cover the widest possible
range of student outcomes. This plan represents a compromise between the
requirements of standardization and comparability of obtained information
on the one hand, and, on the other, the desirability of maintaining
flexibility in research tactics and approaches. The research program should
not become a vehicle for promoting any single test or other measurement
instrument.

In order to ensure that the basic demographic information items 
reflected the needs and inclinations of the participating institutions as 
closely as possible, the 1965 pilot version of the Freshman Information 
Form was developed in close collaboration with members of the executive 
committee of the American Association of Collegiate Registrars and Admis- 
sions Officers. The final form, which was administered to the 1965 entering 
freshman classes at 61 institutions, included 14 items of basic demographic 
information, 13 items concerning educational and vocational plans, 21
self-ratings, and 57 behavioral stimulus items developed in previous research
on college environments (Astin, 1965b). Additional modifications were made 
in the form to be used in 1966 as a result of a conference of representa- 
tives from the 61 pilot institutions held shortly after the dissemination 
of reports based on the 1965 data. 

Even though a certain degree of standardization of content from 
year to year is necessary in order to study trends in demographic 
characteristics of entering classes, it is difficult to overemphasize the 
importance of maintaining flexibility in much of the content of the form. 
Only in this way will it be possible either to pursue promising research 
leads in greater depth or to explore the potentialities of new ideas, 
hypotheses, and techniques. For these reasons, criticisms and suggestions
for modifications will be solicited each year from leading educational 
researchers and administrators for the design of the new edition of the 
form. 

Follow-Up Forms 

The purpose of follow-up surveys will be to collect the output 
data needed to create the longitudinal records that will be used in various 
research projects. Often the follow-up information will consist simply 
of post-tests on input pre-test items in the earlier Freshman Information
Form. Follow-up forms will also be used to collect information about 
the college environment. 

Information collected through follow-up forms can also be used for 
purely descriptive purposes, such as the monitoring of trends in student 
attrition, rates of transfer, choice of different careers, and the pursuit 
of graduate training. 



Major Uses of the Data Files 

To contribute substantially to educational policy and practice is 
the most important long-range function of the program. The ultimate goal 
of the research is to provide educational administrators, teachers, and 
others concerned with educational policy with a sound body of empirical 
knowledge concerning the relative impact of various educational practices. 
Results of completed projects will be disseminated by means of monographs, 
books, articles in professional journals, and papers presented at meetings 
of professional societies and educational organizations.

Research

Although the American Council on Education's staff will conduct 
a wide variety of continuing longitudinal studies with the data, the nature 
of the files is such that their research potential could never be
fully exploited by the in-house staff. Accordingly, the Council will
regularly invite researchers from other organizations and institutions to spend
some time at the Council in order to pursue their special research interests. 

The program provides an opportunity for a wide variety of substan- 
tive studies in higher education. Some of the areas of research that can 
profitably be explored are: 

Studies of Student Development: Effects of different college
environments on the student's career choice, personality development, mental
health, and educational aspirations; factors affecting student dropouts,
including the later vocational development of the dropout.

Studies of Special Educational Programs: Effects of various types
of governmental and foundation support on the educational environments of
departments and institutions; effects of honors programs.

Manpower Studies: Trends in the career aspirations of students
over time; trends in faculty migration; factors influencing the
recruitment and retention of faculty.

Studies of Teaching Practices: Development of techniques for
evaluating teaching proficiency; effects of specific teaching practices
on student development.



The data files also provide an opportunity for collaborative re- 
search involving data collected by investigators in other organizations. 
The student input data that will be collected each fall, for example,
might be linked with data collected earlier from the same students by 
one or more of the large testing organizations. Such a merging of files 
would permit longitudinal studies covering periods of time other than 
the college years. 

Since the files will provide data for estimating what differ- 
ential weights should be applied in order to control for sampling 
bias, the many institutional studies in higher education now being made 
by other investigators with accidental and other unrepresentative samples 
can be improved. The existence of standard items of information will 
also allow investigators to reduce the redundancy of various research 
questionnaires. If arrangements can be made for exchanging data with
other research organizations, students will not be asked to provide the
same information over and over again; the outside researcher
will merely have to obtain the "key" information needed in order to link
his records with those in the ACE data files, and the available
standardized biographical and demographic data, along with other items
of information required for his research, will be available to him. Thus,
a significant saving in time and added convenience to other researchers,
students, faculty, and institutions may be effected.
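
Two of the ideas in this passage, linking an outside investigator's records to the files by a shared "key" and weighting an unrepresentative sample, can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names, the key, the strata, and the counts are all hypothetical, and the weighting shown (population share divided by sample share within each stratum) is one common approach, not necessarily the one the program will adopt.

```python
def link_records(ace_file, outside_file, key="student_key"):
    """Attach ACE items to each outside record that shares a linking key."""
    ace_by_key = {record[key]: record for record in ace_file}
    linked = []
    for record in outside_file:
        match = ace_by_key.get(record[key])
        if match is not None:
            # Outside items win on conflict; ACE items fill in the rest
            linked.append({**match, **record})
    return linked


def stratum_weights(population_counts, sample_counts):
    """Weight for each stratum = population share / sample share."""
    pop_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    return {
        stratum: (population_counts[stratum] / pop_total)
        / (sample_counts[stratum] / sample_total)
        for stratum in sample_counts
    }


# Hypothetical records; all field names and values are invented
ace_file = [
    {"student_key": 101, "sex": "F", "planned_major": "biology"},
    {"student_key": 102, "sex": "M", "planned_major": "history"},
]
outside_file = [{"student_key": 102, "test_score": 61}]
linked = link_records(ace_file, outside_file)

# Hypothetical counts of institutions by type of control
weights = stratum_weights(
    population_counts={"public": 400, "private": 600},
    sample_counts={"public": 30, "private": 20},
)
print(linked)
print(weights)
```

In this sketch the outside study asks only for its own item and the linking key; the standard biographical items come from the existing file, and over-sampled strata receive weights below one.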

Information 

Each participating institution will receive a tabulation of data 
on its entering class and norms for the entire student population and 
for several subclasses of institutions. The norms will also be 
available to other interested organizations and individuals. 
Experience with the 1965 pilot study indicates that the institutional
reports and national student norms can be issued before the end of the 
fall during which the data are collected. 

Over-all, the data files will provide factual information 
ranging from more or less idiosyncratic and specific items of knowledge 
to all the information in the files on a given topic at a given time. 

The Office of Research will routinely publish reports of various norma- 
tive data from the files. In addition, it will be able to fill 
requests for specific information made by educational institutions and 
outside agencies. 

In order to provide users with ready access to the data files, 
the staff of the Office of Research will prepare a library of flexible 
programs that will perform many of the standard types of data manipula- 
tions likely to be needed. This "software package," which will probably 
be operational before the end of 1967, will include routines for 
computing summary statistics, cross-tabulations, and multivariate analyses
of the files. It is our intention to automate outside users' requests 
for special analyses of the files by developing a system of data files 
and related software that is thoroughly documented for use by others. 
Although an automated data accessing system such as this one requires 
the potential user to fit his special requests to the available file 
arrangement and software, it has the advantages of permitting easy and 
rapid access to the files and of requiring the user to define his needs 
in very explicit terms. 
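
As a rough indication of what one routine in such a library might do, the following Python sketch cross-tabulates two items from a file of student records. It is an illustration only: the record layout and item names are hypothetical, and nothing here describes the actual software planned by the Office of Research.

```python
from collections import Counter


def cross_tabulate(records, row_item, col_item):
    """Count records in each (row category, column category) cell."""
    counts = Counter((r[row_item], r[col_item]) for r in records)
    rows = sorted({cell[0] for cell in counts})
    cols = sorted({cell[1] for cell in counts})
    # Fill empty cells with zero so every row has the same columns
    return {r: {c: counts.get((r, c), 0) for c in cols} for r in rows}


# Hypothetical student records with two invented items
records = [
    {"control": "public", "career": "teacher"},
    {"control": "public", "career": "engineer"},
    {"control": "private", "career": "teacher"},
    {"control": "public", "career": "teacher"},
]

table = cross_tabulate(records, "control", "career")
for row, cells in table.items():
    print(row, cells)
```

The user must name the two items explicitly, which reflects the point made above: an automated system obliges the user to define his request in terms of the available file arrangement.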

Although all the types of requests for information cannot be 
anticipated, some examples of frequent requests (which might require
expensive surveys were the information needed but not readily available)
are: personal characteristics of students entering various types of
institutions; analyses of how various types of students finance their
education; trends in student choices of careers; and the distribution of
scholarship and fellowship funds.

These examples merely suggest the wealth of information services 
that can be performed with the data files. It is apparent that the 
availability of standard biographical and demographic information together 
with other research information collected from the students at a repre- 
sentative national sample of higher educational institutions would be of 
considerable use in educational planning and policy. 

Training 

Much of the training of professional educational researchers 
takes place in the university setting. The impetus provided by federally 
financed programs for setting up committees of educational measurement 
and statistical analysis has given rise to many new graduate programs in 
educational research and methodology. A primary function of such programs 
is to provide the student with a sound theoretical and methodological 
background. One of the trainee's most frequent complaints is that this 
emphasis is "too theoretical." This dissatisfaction gives rise to both 
a demand and a need for practical applications. Since the nature of 
educational problems often dictates the use of large samples and longi- 
tudinal designs, it is likely that the data files can be used as both a 
research tool for predoctoral and postdoctoral fellows, and, as we have 
already suggested, a vehicle for the pursuit of specialized research
problems by experienced researchers from other organizations and insti- 
tutions. The potential researcher can thus have an opportunity to apply 
his theoretical and technical skills to substantive problems in higher 
education, even before his formal training is actually completed. 

Another important training function currently planned in 
connection with the program is an annual conference of researchers 
engaged in interinstitutional studies of higher education. These 
conferences should be useful in planning for the program and in stimu- 
lating communication and collaborative research among different investi- 
gators. 

Research Projects Currently Planned or Under Way

In this section we provide brief descriptions of projects
currently under way or planned by the American Council on Education 
staff. It should be stressed, however, that the data files are 
designed to support an ongoing program of studies. The data on educational 
inputs, outputs, and environments now being incorporated into the files 
will make it possible to conduct a variety of other longitudinal studies 
quickly and at a relatively small cost. If a particular research 
hypothesis cannot be tested adequately with available data, the appro- 
priate new items can be substituted in the flexible portion of the 
Freshman Information Form.

Origin and Development of the College Environment 

A recent theoretical development in higher education research 
is the concept of the college environment as a "stimulus." This theory has 
led to the construction of the Inventory of College Activities (ICA), an
instrument that measures the college environment primarily in terms of 
the frequency of occurrence of various student behaviors. Research with 
the ICA has shown, in brief, that colleges differ widely in the 
frequency with which their students exhibit various forms of behavior. 

An important question, both theoretically and practically, concerns how 
these different behavioral patterns develop. Are college environments 
simply a reflection of the types of students initially recruited, or are 
patterns of student behavior shaped by certain administrative policies,
curricular practices, or other factors?

In the proposed study we shall attempt to explore these questions
by studying changes in patterns of student behaviors as assessed by the
ICA prior to college and after one and two years in college. Selected
student behaviors from the ICA, which were included in the 1965 pilot study
of entering students, have also been included in the 300-institution
input study now under way for 1966.

During the summer of 1967, these same behavioral items will be
repeated in a follow-up questionnaire that will be sent to students from
both samples. Analyses of data and completion of final reports are
expected by December 1968.

An Exploratory Study of the Process of College Choice 

One of the critical choice points in the educational development 
of a student is his selection of a college. Although great amounts of 
time and effort are expended annually by counselors, students, and parents 
in deciding on the "right" college, very little is known about how these 
choices are made.

If we are to increase our general understanding of this complex 
decision process, we must learn more about (a) the kinds of information 
about colleges that are typically available to the high school student; 
and (b) the sources or channels of communication for this information. 
Initially, we propose to seek answers to the following questions: How do
particular colleges first come to the attention of the prospective college
student? What is the relative importance of parents, friends, guidance 
counselors, and others in directing students toward given colleges? What 
kinds of information about the college environment are typically available 
to the student before he actually enrolls? In what areas are the pros- 
pective college student's expectations about his college's environment 
most likely to be inaccurate? Does the accuracy of the student's expec- 
tation about his college vary by type of college, type of student, or 
informational source? 

Items to provide answers to these questions have been included 
in the 1966 student input survey. The student's perception of his college
environment will be assessed by means of scales developed in previous 
research with the Inventory of College Activities. It is expected that 
this study will be completed by June 1967. 

Attrition Among College Students

The attrition of college students is an important criterion for any 
program of research in higher education. Although the term "dropout" is a 
negative value judgment, implying both failure and loss, the relevance of
attrition to educational planning and practice is manifest from the time 
and money spent in studying it. Dropping out connotes, at one level,
individual educational failure; that is, failure either of the student
involved or of the educational system. At a broader social level the
dropout problem implies a more far-reaching detriment because the
presumed talent will not be available to our society in the form of trained
manpower.

Research on dropouts will follow the general input-environment-output
model described earlier. This continuing project will include
several alternative methods of defining the dropout criterion, and pro- 
visions for testing the validity of several rival theories commonly ad- 
vanced to account for student attrition. For example, we have included 
in the 1966 Freshman Information Form items about the entering student's
plans for marriage, his degree of concern about college finances, and the 
sources he expects to call upon to finance his undergraduate education. 
These variables are often cited as ex post facto arguments for dropping 
out of college. For the first time, they are being included as input 
control variables in a longitudinal study of college student attrition. 
These variables, together with other student personal variables and en- 
vironmental measures, will provide a frame of reference to which the 
later behavior of the dropout can be related. The principal objective 
of the study is to identify the personal and environmental factors
associated with attrition, with special emphasis on those environmental
factors that can be manipulated.

Student input data are being collected from the 1966 entering 
students at the sample institutions, with initial criterion data scheduled 
for collection in the fall of 1967. Preliminary findings from the initial
one-year longitudinal study will be completed by June 1968. The general
plan for the research program is to monitor the students' progress
regularly by means of periodic follow-ups, and to use subsequent student input
surveys to explore more thoroughly the validity of promising predictor
variables.

Career Choice and the College Environment

The undergraduate institution is one of the principal mechanisms 
for channeling and developing skilled manpower. The potential importance 
of the college as a determinant of the manpower supply in various fields 
is illustrated by the fact that more than half of the students who complete 
their undergraduate education receive degrees in fields different from 
the ones in which they began their studies. Following the methodology 
outlined previously, we shall attempt to identify factors in the college 
environment that influence the student's choice of a career and field of
study.

One of the principal limitations of current theories of career 
development is the inadequate treatment given to environmental factors. 
With a few exceptions, these theories are exclusively psychological in 
conception, with only cursory consideration given to the role of environ- 
mental constructs in the career development process. One major theoret- 
ical hypothesis to be explored in this project is based on a theory of 
selective environmental reinforcement. Briefly, this hypothesis states 
that the student's career choice tends to shift in the direction of the
dominant or modal choice of his fellow students.

A second hypothesis, related to current theories of career
development, concerns the role of the self concept. Specifically, we shall
attempt to (a) determine whether changes in career plans are accompanied 
by appropriate changes in the student's self concept, and (b) explore the 
role of environmental factors in mediating such changes. 

Input data for this project were collected in the fall of 1965. 
Follow-up data will be collected four years later, in the spring of 1969, 
at the expected time of graduation from college. Completion of the final 
report is expected by June 1970. 

Methodological Study of Hierarchical Grouping Models for Taxonomy 
of Institutions 

Typically, a priori classification along some obvious "dimensions" 
is used in educational research, although such dimensions have their 
major justification for administrative rather than research purposes. 
Examples of this type of classification are colleges versus universities, 
or publicly controlled versus privately controlled institutions. Alter- 
native classifications, however, may be more useful in some educational 
research, especially where these classifications are based on objective 
measurements of institutional characteristics. A crucial consideration
in the use of these empirical models is the sensitivity of the resulting
classification to the nature of the input data and to the choice of
paired-comparisons measures derived from such data. Certain other
characteristics of the models and of the strategies of their application in
classifying large numbers of objects (i.e., institutions) require
examination and comparison with related models (such as the Schmid-Leiman
hierarchical factor analysis model). Outcomes of the various strategy
choices require comparison with each other and with existing a priori
classifications.

In the proposed methodological study of the hierarchical grouping 
models, we shall utilize all four-year institutions included in the sample. 
Empirical bases of classifications to be compared will include measures of 
traditional a priori variables, freshman input factor variables, environ- 
mental orientation factor variables, and combinations of these. The re- 
search will explore the possibility of obtaining control of input factors 
by stratification of institutions along dimensions so defined. Where 
different classifications result, normative data from the research files 
will be extracted for the alternative types. It is anticipated that 
the methodological studies in the earlier phases of this research can 
be completed by December 31, 1966, and that subsequent use of the results 
in working with the substantive data files can proceed during the following 
year. The earlier phases of detailed planning and of adopting available 
computer programs are currently under way. 
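
The general idea of empirically grouping institutions, and of the classification's dependence on the paired-comparison measure chosen, can be sketched in Python as follows. The procedure shown is a simple agglomerative one that repeatedly merges the two groups with the closest centroids; it is an assumption made for illustration and is not necessarily the hierarchical grouping model to be studied. The institution names and measurements are hypothetical.

```python
from itertools import combinations


def euclidean(p, q):
    """One possible paired-comparison measure between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def centroid(members, profiles):
    """Mean profile of the institutions in one group."""
    dims = len(profiles[members[0]])
    return [sum(profiles[m][d] for m in members) / len(members)
            for d in range(dims)]


def hierarchical_grouping(profiles, n_groups, distance=euclidean):
    """Merge the two closest groups until n_groups remain."""
    groups = [[name] for name in sorted(profiles)]
    while len(groups) > n_groups:
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda ij: distance(centroid(groups[ij[0]], profiles),
                                    centroid(groups[ij[1]], profiles)),
        )
        merged = groups[i] + groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups.append(merged)
    return groups


# Hypothetical institutions, each described by two invented measurements
profiles = {
    "College A": [1.0, 1.0],
    "College B": [1.2, 0.9],
    "University C": [5.0, 5.0],
    "University D": [5.3, 4.9],
}

groups = hierarchical_grouping(profiles, n_groups=2)
print(groups)
```

Passing a different `distance` function changes which groups are merged first, which is one way to examine the sensitivity of the resulting classification to the choice of measure.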

The Use of Co-Twin Controls for Analysis of Nature-Nurture Effects in
Educational Research Data

Because of numerous methodological and conceptual difficulties 
encountered over the years, many educational and psychological researchers 
have tended to relegate the old "nature-nurture" controversies to the
realm of limbo. Since the proposed research program is designed to create 
data files specifically for research purposes rather than merely to 
collect data already obtained from various sources and for various purposes, 
data bearing on these old questions will be obtained on a large control
sample. Since over 200,000 subjects will be involved in the 1966 input
survey, some 2,000 sets of twins may be expected in the resulting files.
A recent methodological contribution, consisting of a zygosity-diagnosis 
questionnaire with an accuracy of 93% validated against extensive blood- 
typing discrimination of twin zygosity, will be used to obtain a low- 
cost, highly reliable discrimination in the file twins. Assuming about 
3 dizygotic to 1 monozygotic pairs in our files, and ignoring differential 
death rates and institution attendance rates, we may expect 1,500 sets 
of fraternal twins and 500 sets of identical twins. By Mendelian laws, 
we may expect 375 each of male fraternal pairs and female fraternal sets 
and 750 male-female fraternal sets. We may also expect 250 each of male 
identical sets and female identical sets. In any case, we will know actual 
numbers from the data, and it is apparent that sufficient numbers of 
various types of twins should be available for additional factors to be 
added to analysis of data, and still have enough cases per cell to 
stabilize statistics. 
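
The arithmetic behind these expected counts can be checked with a short Python sketch. It uses only the figures given above (2,000 twin sets and a ratio of 3 dizygotic to 1 monozygotic) and makes the further simplifying assumptions, stated here for illustration, that each twin is equally likely to be male or female and that the sexes of fraternal co-twins are independent. The function name is hypothetical.

```python
def expected_twin_counts(total_sets, dz_per_mz=3):
    """Expected numbers of twin sets by zygosity and sex composition."""
    identical = total_sets / (dz_per_mz + 1)
    fraternal = total_sets - identical
    return {
        # Fraternal pairs: MM, FF each 1/4; mixed-sex 1/2
        "fraternal male-male": fraternal / 4,
        "fraternal female-female": fraternal / 4,
        "fraternal male-female": fraternal / 2,
        # Identical pairs are always same-sex: half MM, half FF
        "identical male-male": identical / 2,
        "identical female-female": identical / 2,
    }


counts = expected_twin_counts(2000)
for kind, n in counts.items():
    print(kind, n)
```

Under these assumptions the function reproduces the figures in the text: 375 male and 375 female fraternal sets, 750 male-female sets, and 250 identical sets of each sex.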

To the extent permitted by the actual counts, nature-nurture
differences will be examined with respect to the following variables:
college choice; academic achievement and academic ability; career and
higher educational aspirations, including choices of field; accomplish- 
ments in high school; perceptions of the college; stated interests and 
values; and changes observed from follow-up data. 

It is important to note that genetically determined factors are 
necessarily inputs to the educational process and presumably interact 
with earlier environmental influences. Such interactions are in part 
reflected in the student input data obtained at the time of entrance to
college. By controlling environmental information in the twin sets, as 
this is obtained through special follow-ups of the twin samples, it is 
expected that a residual nature composite may be derived and used to 
provide a sharper separation and control of confounding effects and inter- 
actions in other phases of the program. 

Initial identification and count of twins obtained in the basic 
files should be completed by December 31, 1966, with follow-up zygosity 
diagnoses and special background questionnaires sent to the twin samples 
during the second semester of the academic year. Final definition of the 
twin samples with zygosity discrimination and initial analyses will be 
completed by June 1967 with a report of preliminary results. Further 
analyses of the twin data files will then be initiated in the fall of 
1967. 

Correlates of Birth Order Among Student Outcomes 

Recently researchers have been giving more attention to the effect 
of ordinal family position on the academic achievement and later vocational 
development of the student. Although a large number of studies indicate 
the pervasiveness of "birth order effects," the suggested causal
relationships are often not testable in the context of the particular study or
are contradicted by other research. This result follows from the fact 
that birth order data are usually collected as ancillary information, 
without deliberate design, and are then correlated with other data. Thus 
one of the principal limitations of current hypotheses concerning effects 
of birth order is the inadequate data collected. Furthermore, these data 
are not obtained in a context that permits testing alternative hypotheses.

During the fall 1965 pilot study, birth order data were collected 
from 42,000 entering students. Preliminary analyses of these data indicate 
that it is necessary to obtain information with regard to age and sex dis- 
tributions of siblings within family sizes. This information is needed 
in order to define and explore the effects of early environmental inter- 
actions within the home. These interactions, it is suspected, may offer 
a major alternative explanation for so-called birth order effects. 

The preceding considerations led to the formulation of the birth 
order item as it appears in the 1966 Freshman Information Form. Input data 
concerning ordinal family position will be collected in the fall of 1966. 

This information, together with other personal and environmental measures, 
will be explored within the framework of a number of hypotheses derived 
from physiological, sociological, and economic theories. The principal 
objective of the study is to determine the correlates of birth order
effects among student outcomes and to provide insights into the processes
underlying the observed effects.

Preliminary findings from the 1966 data will be completed by the 
fall of 1967. Follow-up data collected from the 1966 sample will be
included in further analyses of the birth order data in order to document
relationships from the earlier analysis with later student developments. 

These outcomes will include such student behavior as dropping out of college, 
final career choice, and length of time required to obtain a degree. 






Summary 

In this paper we have presented a plan for a broad program of 
continuing longitudinal research on the American higher educational system. 
This research program will be based primarily on a comprehensive file of 
information updated annually from a representative sample of higher edu- 
cation institutions. The file will contain detailed longitudinal informa- 
tion concerning the students and environments of the participating insti- 
tutions in a readily available and accessible format. 

The research data file is designed to serve three basic functions:
research, information, and training. It is expected that the data files
will be utilized as a research tool by other educational organizations and 
individuals concerned with higher education. The standardization and 
resulting comparability of data and the flexible nature of the research 
program outlined here should make it possible for agencies involved in 
massive data collection procedures to move more rapidly toward coordination 
of their own activities and toward cooperation with other agencies per- 
forming similar functions. 